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Abstract:   An  algorithm  for  selection  of  knot  point  locations  -for 
approximation  of  functions  from  large  sets  of  scattered  data  by 
least  squares  Thin  Plate  Splines  is  given.   The  algorithm  is 
based  on  the  idea  that  each  data  point  is  equally  important  in 
defining  the  surface,  which  allows  the  knot  selection  process  to 
be  decoupled  from  the  least  squares-   Properties  of  the  algorithm 
are  investigated,  and  examples  demonstating  it  a.re    given. 
Results  of  some  least  squares  approximations  are  given  and 
compared  with  other  approximation  methods. 
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1.0   INTRODUCTION 

The  problem  o-f  -fitting  a  surface  to  small  sets  of  given  data 
has  been  addressed  in  many  different  ways  and  several  computer 
programs  are    currently  available  which  enable  one  to  deal  with 
the  problem  effectively.   Many  of  the  methods  available  involve  a 
global  interpolation  or  approximation  scheme  and  often  involves 
solving  a  system  of  equations  with  an  equivalent  number  of 
unknowns.   For  very  large  sets  of  data,  the  problem  is 
computationally  intractable.   This  consideration  provides  the 
motivation  behind  the  development  of  a  way  to  pare  the  problem 
down  to  a  more  manageable  size. 

We  wish  to  construct  a  function  F  which  approximately  fits 
the  data  since  we  assume  the  data  collection  is  subject  to 
measurement  error.   We  propose  to  use  approximation  by  least 
squares  Thin  Plate  Splines  (TPS),  where  the  surface  function  is 
constructed  so  as  to  minimize  an  error  function  subject  to 
certain,  constraints.   Solving  the  approximation  problem  will  also 
involve  as  many  equations  as  there  are    data  points,  but  the 
number  of  unknowns  will  be  significantly  fewer.   Part  of  the 
appeal  of  TPS  approximation  lies  in  the  fact  that  it  minimizes  a 
certain  linear  functional,  and  involves  a  linear  combination  of 
functions  with  no  greater  complexity  than  the  natural  logarithm 
of  the  distance  function. 

Interpolation  of  scattered  data  by  the  method  of  TPS  was 
developed  from  engineering  considerations  by  Harder  and  Desmarais 
C51.   It  can  be  thought  of  as  a  two  dimensional  generalization  of 
the  cubic  spline,  which  models  a  thin  beam  under  point  loads 
subject  to  equilibrium  constraints.   The  TPS  function  is  derived 


-from  a  differential  equation  which  gives  the  deformation  of  an 
infinite,  thin  plate  under  the  influence  of  point  loads.   A  point 
load  is  applied  at  each  data  point  so  that  the  interpolating 
surface  can  be  constructed  as  a  sum  of  fundamental  solutions  of 
the  TPS  equation. 

In  using  the  least  squares  TPS  approximation  method  to  fit 
the  surface,  a  fewer  number  of  basis  functions  than  the  number  of 
given  data  points  is  employed.   These  basis  functions  are 
centered  at  a  different,  smaller  set  of  points,  which  in  analogy 
with  the  univariate  case,  we  call  the  knots.   Therefore,  the 
problem  at  hand  is  one  of  selecting  the  knot  points,  and  hence 
the  basis  functions.   This  approach  differs  from  the  use  of 
smoothing  splines,  which  were  introduced  by  Wahba  and 
Wendelberger  C9D  in  the  multidimensional  case,  and  called 
Laplacian  Smoothing  Splines  (LSS> .   LSS  minimize  a  certain 
functional  which  is  a  linear  combination  of  a  term  measuring 
f ideli ty— to  the  data  and  one  measuring  smoothness  of  the  function 
(a  generalization  of  the  usual  thin  plate  spline  functional).   In 
this  case  there  is  still  one  basis  function  for  each  data  point, 
but  the  interpolation  condition  is  relaxed. 

Given  a  'large'  set  of  data  points,  <x.,y.,f.),  i  =  1,...,N, 
we  wish  to  find  a  smaller  set  of  knot  points,  (£.,y.),  j  = 
1,...,K,  which  will  'represent'  the  former  reasonably  well.   This 
could  be  accomplished  by  choosing  a  subset  of  the  original  set, 
or  by  some  process  which  produces  a  representative  set.   The 
ultimate  goal  is  to  approximate  the  surface  from  which  the  origi- 
nal data  arose  using  the  representative  set.   Hence,  a  surface 
fit  to  the  large  set  and  one  fit  to  the  representative  set  should 


be  essentially  the  same. 

Approximation  by  least  squares  TPS  is  straightforward,  once 
the  knot  points  Are    known.   We  construct  the  TPS  function 

K      2 
F<x,v)  =   Z  A  d  .^loq (d  . )  +  ax  +  by  +  c 
j=l  J  J        J 

*■?  *~»  *■> 

where  d  .   =  (x-X.)^    +     <y-y.)"",  and  the  coefficients  A.  are  chosen 
J  J  J  J 

to  minimise  the  error  function  _ 

N 

E  =   Z  CCF<x.,y.)  -  f.3/5.32  . 

t  1  * 7 1  l    x 

1  =  1 

The  ordinates,  f . ,  may  be  subject  to  random  error^^     say  with 

standard  deviation,  s. ,  at  the  i    data  point.   We  model  the 

plate  under  the  point  loads  at  the  knot  points  (as  opposed  to  the 

data  points);  therefore  the  constraint  equations  for  the  least 

squares  TPS  method,  which  may  be  thought  of  as  'equilibrium 

conditions'  on  the  plate  should  be  satisfied.   Thus,  the  error 

function  is  minimized  subject  to  the  constraint  equations: 

K  — 

Z    A  .  =  0 
j  =  l  J 

jLvj  -  ° 

K 

Z    A  .y  .  =  0 
j  =  l  J  J 

We  use  LINPACK  C1D  subroutines  to  do  the  actual  calculations. 

Previous  attempts  have  been  made  to  minimize  the  error 

function  by  considering  it  to  be  a  function  of  the  knot  point 

locations  as  well  as  the  coefficients,  wherein  a  total  of  3K 

parameters  are  involved.   As  reported  on  by  Schmidt  C83,  the 

initial  knot  configuration  was  taken  to   be  of  tensor  product 


form.   The  overall  minimization  process  is  a  large  non— linear 
one,  and  is  complicated  by  possible  coalescense  of  knots  as  well 
as  non— unique  solutions  (as  indicated  by  consideration  of  one- 
dimensional  cases).   Also,  the  objective  function  may  have  many 
local  minima  so  that  avoiding  poor  local  minima  or  searching  for 
better  local  minima  may  be  necessary.   Because  of  these  kinds  of 
problems,  our  goal  is  to  decouple  the  knot  selection  process  from 
the  least  squares  process. 

When  data  are    somewhat  uniformly  distributed,  methods  invol- 
ving tensor  product  cubic  splines  may  be  desirable.   Tensor  pro- 
duct methods  place  knot  locations  on  a  grid,  which  may  not 
reflect  the  actual  disposition  of  the  data  points;   in  fact, 
there  could  be  no  data  nearby.   Furthermore,  even  though  these 
problems  are  surmountable,  they  could  lead  to  nonuniqueness  of 
solutions  and  a  minimum  norm  solution  that  may  not  be 
aesthetically  appealing. 

A  ilLfferent  point  of  view  is  considered  here  wherein  the 
knot  point  locations  are  predetermined  based  on  two  criteria. 
Specifically,  we  shall  make  assumptions  relating  the  density  of 
data  to  the  dependent  variable  and  mandating  the  importance  of 
each  individual  data  point.   Solution  of  the  overdeter mined  sys- 
tem of  equations  follows  the  knot  point  selection.   A  summary  of 
the  approach  and  results  will  be  presented.   Examples  are    given 
which  illustrate  rather  well  the  ability  of  the  scheme  to  select 
knot  locations  which  reflect  the  underlying  density  of  the  data. 
Actual  surface  fitting  and  comparison  with  two  other  methods,  the 
Laplacian  Smoothing  Splines  of  Wahba  and  Wendelberger  C9D,  and 
the  tensor  product  bicubic  Hermite  method  due  to  Foley  C33  Are 


also  reported  on. 

We  also  point  out  two  related  ideas  which  a^re    attempts  to 
decrease  the  number  o-f  basis  functions  to  be  considered-   Each  is 
an  attempt  to  choose  a  subset  o-f  points  to  be  used  to  construct 
the  approximation.   Schiro  and  Williams  C7D  used  an  adaptive 
process  for  subset  selection  and  overlapping  regions  to  construct 
underwater  topographic  maps.   Bozzini,  diTisi,  and  Lenarduzzi  12.1 
gave  an  algorithm  -for  selecting  a  subset  o-f  points  which  were 
important  to  proper  definition  of  the  surface.   Both  of  these 
methods  made  no  assumptions  about  the  density  of  the  data  points 
relative  to  the  behavior  of  the  surface,  and  required 
consideration  of  the  ordinate  values. 

2.0   THE  KNOT  SELECTION  PROCESS 

Given  'a  priori'  flexibility  in  knot  placement,  the  problem 
becomes  the  selection  of  knot  locations,  followed  by  solution  of 
the  system  by  least  squares.   Since  the  selection  of  knot 
locations  is  to  be  decoupled  from  the  solution  of  the  least 
squares  problem,  some  assumptions  must  be  made  in  order  to  devel- 
op an  algorithm  for  the  knot  selection  process.   First,  we 
assume  that  the  independent  variable  data  reflects  something 
about  the  behavior  of  the  dependent  variable.  For  example,  the 
density  of  the  data  points  may  be  dependent  on  the  curvature  of 
the  surface.   Hence,  where  relatively  many  data  points  are  found, 
the  function  is  assumed  to  be  changing  behavior  rapidly,  whereas 
a  low  density  of  data  indicates  slowly  changing  behavior. 
Although  this  assumption  is  not  universally  satisfied  in 
practice,  it  does  not  seem  to  be  an  unreasonable  one. 


The  second  assumption  is  that  each  data  point  is  equally 
important  in  defining  the  underlying  surface.   Therefore  the 
number  of  data  points  represented  by  each  knot  should  be  the  same 
or  nearly  the  same.   This  leads  to  'equal  representation'  of  the 
data  points  by  the  knot  points  where  each  data  point  is  'close' 
to  a  knot  point.   A  key  advantage  achieved  in  pursuing  this 
approach- is  the  existence  of  a  natural  heuristic  for  moving  the 
knots  around  the  plane  in  searching  for  a  good  knot 
configuration.   This  point  will  be  elaborated  on  later  in  the 
paper. 

Our  knot  selection  algorithm  is  based  on  these  last  two 
assumptions.   First,  we  wish  to  minimize  the  sum  of  the  distances 
squared  from  each  data  point  to  the  nearest  knot  point;   that  is, 
minimize  the  'global '  value, 

N  2  2 

GN^   =  Z    min  C<x.-£.)   +  (y.-O.)  1 

1  =  1  j 
This  function  is  continuous  and  piecewise  quadratic.   The 
expression  leads  naturally  to  a  'default'  Dirichlet  Tesselation, 
a  partitioning  of  the  plane  with  respect  to  the  knot  points  (see 
Figure  1).   Thus,  we  say  each  data  point  belongs  to  some  knot 
point  according  to  the  Dirichlet  tile  in  which  it  lies.   Data 
points  on  any  of  the  tile  boundaries  (ties)  must  be  resolved  by  a 
determination  of  which  tile  they  belong  to  or  some  sharing 
mechanism.   Our  initial  guess  at  the  knot  point  configuration  was 
taken  to  be  quasi-gri dded. 

Differentiation  of  GN^  with  respect  to  the  &  .  and  y   show 
that  at  the  minimum,  each  knot  point  will  occupy  the  centroid 
with  respect  to  the  data  points  inside  that  tile.   Given  the 


initial  configuration  o-f  knot  points  with  its  Dirichlet 
Tesselation,  the   -following  algorithm  for  iteration  to  a  local 
mini  mam  GN~"  value  is  employed: 

(a)  compute  the  centroid  of  each  tile  with  respect  to  the 
data  points  contained  within  each  tile; 

(b)  move  the   knots  to  the  corresponding  centroids,  which 
results  in  a  new  Dirichlet  Tesselation  and  a  new  set  of  knot 
point  -  data  point  associations;   this  is  the  configuration  for 
the  next B iteration.  _ 

(c)  quit  when  two  successive  iterations  yield  the  sa,me  knot 
locations,  which  means  that  a  local  minimum  value  of  GN^  has  been 
found. 

This  algorithm  was  formulated  in  discussions  at  the  Istituto  per 

le  Applicazioni  della  Matematica  e  del 1 ' Inf ormatica  in  1983  C103, 

after  the  problem  was  posed  by  G.  Nielson  and  R.  Franke. 

2 

We  note  that  the  value  of  GN"1"  will  necessarily  decrease  as 

the  iterations  continue,  until  two  successive  iterations  yield 

the  same  configuration;   this  will  be  proven  below.   In  the  case 

where  no  data  points  lie  in  a  tile  for  some  knot  point,  the  knot 

point  is— moved  to  the  nearest  data  point.   This  mechanism  avoids 

knots  without  data  points.   Futhermore,  if  a  data  point  lies  on  a 

tile  boundary,  it  is  assigned  to  the  knot  with  the  smallest 

subscript   (amongst  the  appropriate  choices  of  knot  points). 

Employment  of  a  different  criterion  for  the  resolution  of  ties 

may  yield  different  results.   We  note  that  knots  cannot  coalesce. 

The  fallowing  theorem  is  pertinent  to  this  algorithm. 

Theorem:   The  function  GN"~  decreases  with  each  iteration  which 
involves  movement  of  a  knot  point. 

Proof:   Write  GN^   in  the  more  convenient  form 

K 

GN2  =  Z  Z  [(*.-<>  >2  +  <y. -y  )2:  (1) 


where  I    =   {i:  (xi,y.)  in  the  tile  for  <#.,?>}.   In  (1),  the 
interior  sum  is  the  sum  of  the  distances  squared  from  the  data 
points  in  a  tile  to  the  knot  point  in  that  tile,  and  the  exterior 
sum  is  over  all  K  o-f  the  tiles.   Let  a  prime  denote  the  new  knot 
points  and  index  sets.   This  -form  leads  to  the  expressions 

<*;,y;>    =  <  E  x  /n  ,  E  V./n  )  , 

J   J      itl  .      i€I  .    J 
J         J 

where  n  .  is  the  number  o-f  indices  in  the  set  I  ..   The  new  knot 
J  J 

points  will  lead  to  a  new  tesselation,  -fallowed  by  the  new  index 
sets  I'..   Then  the  expression  (1)  is  greater  than  or  equal  to 

K  2  2 

E    E  C  <*.-£'.)   +  Cy.-y'.>  D  (2) 

because  the  new  knot  point  locations  minimize  the  contribution  of 
the  interior  sums.  This  expression  (2),  in  turn,  is  greater  than 
or  equal  to 

E    E  L(x.-fr.)       +  (y.-y.)^3  (3) 

j=l  iclj   x   J        x  Yj 

since  an  index  i  moves  to  another  set  only  in  the  case  wherein 


the  corresponding  data  point  is  now  closer  to  a  different  knot 

2 
point,  thus  decreasing  its  contribution  to  the  global  GN   value. 

2 

Finding  a  local  minimum  of  GN""-  is  well— served  by  this  algo- 
rithm; however,  as  seen  in  a  one  dimensional  example  C63,  the 

2 
function  GN   is  rife  with  local  minima,  and  the  local  minimum 

value  found  depends  on  the  initial  configuration  of  knots  used. 

We  can  draw  similar  conclusions  for  the  multidimensional  case 

based  on  the  one  dimensional  analogy. 

The  process  of  locating  each  knot  occurs  in  two  distinct 

steps:   first,  each  data  point  is  assigned  to  the  closest  knot 


8 


and  second,  a  determination  is  made  within  each  tile  as  to  the 

location  of  its  centroid.   As  a  direct  result  of  the  centroid 

2 
requirement,  the  GN   function  will  stabilize  at  the  local  minimum 

value  corresponding  to  the  particular  quadratic  piece  on  which 

the  knot  is  -found. 

The  local  minimum  value  will  -frequently  occur  out  o-f  the 

domain  o-f  the  corresponding  quadratic  piece.   This  leads  to  a 

'cascading'  phenomenon  which  continues  until  a  local  minimum 

value  is  attained.   However,  the  global  minimum  value  will  not 

necessarily  be  attained  since  the  cascading  will  stop  as  soon  as 

a  local  minimum  value  is  found  within  the  domain  of  each 

quadratic  piece. 

This  inconsistent  performance  of  the  algorithm  in  finding 

2 
the  global  minimum  value  of  GN   leads  to  consideration  of  a 

somewhat  different  criterion  for  locating  a  good  configuration  of 
knot  points.   We  wish  to  exploit  the  second  assumption  specified 
earlier,_while  taking  advantage  of  the  minimization  of  the  GN^ 
function.   Since  each  data  point  is  assumed  to  be  equally  impor- 
tant, the  Dirichlet  tile  for  each  knot  should  contain  about  the 
same  number  of  data  points.   Thus,  we  wish  to  minimize  the  sum  of 
the  squares  of  the  differences  between  the  number  of  knots  in 
each  tile  and  the  average  number  of  data  points  that  should 
belong  to  each  tile;   that  is,  minimize  the  quantity 

K 

D  =  Z     <n   -  N/K>" 

j  =  l   J 
The  new  algorithm  for  determining  knot  locations  is  based  on  the 
minimization  of  D,  subject  to  the  constraint  that  each  knot  be 
located  at  the  centroid  of  its  tile. 


This  new  optimization  leads  to  a  natural  heuristic  for 

moving  knots  from  a  stable  configuration  to  a  possibly  better 

con-figuration.   We  call  the  current  con-figuration  of  knots  a 

'base'  configuration,  and  iterate  through  the  algorithm  as 

f ol lows: 

(a)   generate  a  new  guess  for  the  knot  locations  by  moving 
the  knot(s)  with  the  smallest  number  of  data  points  in  their 
tile(s)  toward  the  knot (s>  with  the  largest  number  of  knot(s)  in 
their  tile(s);  the  distance  moved  is  initally  a  large  fraction  of 
the  total  distance  between  the  knots. 

(b)  iterate  to  a  stable  configuration  using  the  first 
algorithm,  compute  the  values  of  GN"~  and  D,  and  compare  D  to  the 
smallest  value  achieved  to  date,  as  represented  by  that  of  the 
base  configuration; 

(c)  repeat  the  process  above  when  a  smaller  value  of  D  is  ob- 
tained, with  the  present  configuration  as  the  base  configuration; 

(d)  when  a  smaller  value  of  D  is  not  found,  take  a  shorter 
step  in  the  movement  of  the  knot(s)  and  repeat  the  process  above; 

(e)  continue  with  smaller  and  smaller  steps  until  a  smaller 
value  of  D  is  found  (or  an  equal  value  of  D  with  a  smaller  GIM^ 
value)  or  until  the  knot  locations  return  to  the  base 
configuration; 

(f )  perform  the  search  in  the  symmetrical  way  when  the  base 
configuration  is  returned  to;   that  is,  move  the  knot(s)  with  the 
largest  number  of  data  points  in  their  tiles  toward  the  knot(s) 
with  the  smallest  number  of  points  in  their  tile; 

(g)  quit  when  no  smaller  value  of  D  is  found. 

The  movement  of  the  knots  is  justified  by  the  rationale  that 
a  more  equitable  distribution  of  data  points  can  be  found  by 

moving  the  tile  boundaries  across  data  paints.   Note  that  the 

2 
algorithm  for  computing  a  local  minimum  of  the  GN   function  value 

is  embedded  in  this  new  algorithm. 

3.0   RESULTS  AND  EXAMPLES 

Using  our  algorithm  for  the  a  priori  selection  of  the  knot 


10 


point  locations,  experiments  were  conducted  to  test  the  scheme 
using  different  sets  of  test  data.   This  was  followed  by 
verification  of  the  scheme  on  a  large  set  of  real  data.   Results 
from  two  sets  of  the  test  data  Are    presented  here:   one 
consisting  of  200  data  points  called  'Cliff',  and  one  consisting 
of  500  data  points  called  'Humps  and  Dips'.   Both  sets  of  data 
were  generated  using  known  functions  (see  Franks  C41)  in  a  way 
that  farced  the  density  of  points  to  be  proportional  to  the 
curvature  of  the  sampled  function.   Figures  2-5  show  these  two 
test  data  sets  graphically,  and  illustrate  the  optimized  knot 
point  configurations  found  using  the  least  squares  algorithm. 
Figure  6  depicts  actual  hydrographic  data  collected  in  Monterey 
Bay,  with  greatly  varying  density.   Figure  7  shows  the  results  of 
applying  the  algorithm  with  100  knots,  and  is  particularly 
interesting  because  the  density  of  the  data  is  faithfully 
replicated  by  the  knots.   We  note  that  the  assumption  regarding 
the  density  of  the  data  points  being  indicative  of  the  behavior 
of  the  dependent  variable  is  not  actually  satisfied  in  this  case, 
due  to  the  source  of  the  data.   Nonetheless,  these  results 
demonstrate  the  ability  of  the  algorithm  to  produce  representa- 
tive sets  of  knots. 

We  also  investigated  how  closely  the  constructed  surface  F 
and  the  'true'  surface  resemble  one  another.   This  comparison  is 
made  in  the  context  of  the  root-mean— squared  error  (RMS)  of  the 
residuals  (at  the  data  points)  and  on  a  rectangular  grid  of 
locations  in  the  plane.   The  two  data  sets  constructed  above,  and 
one  other  which  was  generated  in  a  similar  manner,  were  used  to 
compare  RMS  errors  for  the  least  squares  algorithm  developed  here 
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with  the  method  of  Wahba  and  Wendelberger  (Laplacian  Smoothing 
Splines),  and  the  method  o-f  Foley  (Bicubic  Hermite  Tensor  Product 
Spl ines) . 

The  dependent  variable  values  o-f  the  experimental  data  sets 
were  generated  in  two  ways:   1)  using  a  known  function,  and   2) 
contaminating  the  known  function  by  the  injection  of  independent, 
identically  distributed  normal  random  errors  with  a  specified 
composite  standard  deviation  of  0.05.   The  actual  standard 
deviation  was  about  0.0485.   In  the  first  case,  we  would  expect 
the  RMS  error  on  the  data  points  and  on  the  grid  to  be  about  the 
same,  and  to  decrease  as  the  number  of  knot  points  used  to 
represent  the  data  is  increased. 

In  the  contaminated  case,  the  dependent  variable  at  each  data 
point  is  the  sum  of  the  unknown  underlying  function  value  and  the 
error  function  value  so  that  the  difference  between  the 
constructed  surface  and  the  'true'  surface  is  mainly  attributable 
to  the  presence  of  error  in  the  data.    In  the  best  situation,  we 
expect  the  RMS  error  in  the  residuals  to  match  the  composite 
standard  deviation  of  the  random  error  injected  to  obtain  the 
contaminated  data.   At  the  grid  points,  we  expect  the  RMS  error 
to  be  smaller  than  the  composite  standard  deviation,  since  the 
grid  sample  is  larger  (33*33)  and  the  errors  are    distributed  more 
evenly  throughout  the  entire  region  of  interest. 

Some  observations  can  be  made  regarding  Tables  1—3.   The 
general  trend  of  the  RMS  error  on  both  the  data  points  and  the 
grid  is  to  diminish  as  the  number  of  knot  points  is  increased. 
As  expected  with  the  exact  data,  the  RMS  error  of  the  residuals 
and  the  RMS  error  on  the  grid  are    roughly  equivalent.   For  the 
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contaminated  data,  the  RMS  error  of  the  residuals  roughly  matches 
the  composite  standard  deviation  o-f  the  data,  and  the  RMS  error 
on  the  grid  is  smaller  than  the  RMS  error  o-f  the  residuals,  as 
expected.   In  Table  1,  the  errors    over  the  grid  increase  as  the 
number  of  knot  points  is  increased,  and  the  RMS  value  of  the 
residuals  becomes  less  than  the  composite  standard  deviation  of 
the  injected  errors.   In  this  case,  under smoothing  has  occurred, 
and  the  surface  is  "drawing"  toward  the  error. 

In  comparing  least  squares  TPS  to  the  smoothing  spline 
method  in  the  exact  data  case,  we  note  that  the  smoothing  spline 
method  yields  a  residual  RMS  srrar    of  0.   This  could  be  expected, 
since  there  is  no  error  in  the  data  and  the  spline  of 
interpolation  is  chosen.   On  the  grid,  the  RMS  error  is  small. 
When  the  data  is  not  contaminated,  the  RMS  error  of  the  least 
squares  algorithm  only  begins  to  become  as  small  as  that  of  the 
smoothing  splines  method  when  the  number  of  knots  used  becomes 
large.   We  also  note  that  in  the  500  data  point  set,  no 
comparison  is  made  since  a  potential  limit  for  computing 
smoothing  splines  is  200-300  data  points. 

Foley's  method  for  the  contaminated  case  gives  errors  nearly 
equal  to  the  composite  standard  deviation  of  the  errors  injected 
into  the  data.   However,  on  the  grid,  the  least  squares  method 
does  better,  an  indication  that  smoothing  is  occurring,  as 
expected.   We  also  note  that  an  increase  in  the  number  of  grid 
points  does  not  significantly  improve  the  RMS  error  in  Foley's 
method,  even  though  an  increase  in  the  number  of  knots  in  the 
least  squares  method  usually  yields  improved  results.   We  used 
the   default  local  approximations  in  Foley's  method,  and  we  note 
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that  performance  o-f  the  method  may  be  improved  by  using  lower 
degree  local  approximations  to  estimate  the  grid  values  to  be 
used. 

Finally,  we  note  that  the  search  -for  a  best  knot  con-figura- 
tion can  turn  out  to  be  rather  expensive.    For  a  large  number  of 
data  points  with  a  moderately  large  number  o-f  knot  points,  the 
computational  e-f-fort  could  be  excessive,  although  we  are 
investigating  ways  o-f  speeding  up  the  algorithm.   Furthermore,  as 
we  noted  earlier,  the  end  results  are  dependent  on  the  initial 
guess,  although  they  generally  look  quite  good  -for  any  reasonable 
initial  guess. 
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METHOD 


NUMBER  OF 
DATA  POINTS/ 
KNOT  POINTS 


NO  ERRORS 
IN  DATA 

RESIDUAL    GRID 


CONTAMINATED  DATA 


RESIDUAL 


GRID 


LSTPS 
LSTPS 
FOLEY 
LSTPS 
FOLEY 
SMOOTHING 


200/20 

200/25 

200/5x5 

200/35 

200/6x6 

200 


0.01562 
0.01179 
0.00777 
0.00626 
0.00512 
0.0 


0.01474 
0.01154 
0 . 006 1 3 
0.00616 
0.00417 
0.00096 


0.05214 
0.04305 
0.05996 
0.04590 
0.05113 
0.04272 


0.01795 
0.02040 
0. 04819 
0.02146 
0.03745 
0.01806 


Table  Is   Comparison  of  RMS  errors  on  'CLIFF'  200  points 


METHOD 


NUMBER  OF 
DATA  POINTS/ 

KNOT  POINTS 


NO  ERRORS 
IN  DATA 

RESIDUAL    GRID 


CONTAMINATED  DATA 


RESIDUAL 


GRID 


LSTPS 
LSTPS 
FOLEY 
LSTPS 
FOLEY 
SMOOTHING 


200/20 

200/25 

200/5x5 

200/35 

200/6x6 

200 


0.05525  0.05465 

0.02520  0.02646 

0.01206  0.01332 

0.01662  0.01843 

0.00968  O.OU44 


0.0 


0.00254 


0.07571  0.05866 

0 . 05603  0 . 03385 

0.04819  0.04965 

0.05274  0.02853 

0.05028  0.03962 

0.03900  0.02789 


Table  2:   Comparison  o-f  RMS  errors  on  'HUMP  AND  DIPS'  200  points 
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METHOD 


NUMBER  OF 
DATA  POINTS/ 
KNOT  POINTS 


NO  ERRORS 
IN  DATA 


RESIDUAL    GRID 


CONTAMINATED  DATA 


RESIDUAL    GRID 


LSTPS 
L5TPS 
FOLEY 
LSTPS 
FOLEY 
Tab 1 e  3 : 


500/20  0.02402  0.02517 

500/25  0.01664  0.01766 

500/5x5  0.01346  0.01230 

500/50  0.00645  0.00S45 

500/7x7  0.00645  0.00552 


0.05256  0.02738 

0.04818  0.02283 

0.05844  0.03767 

0.04544  "0.01961 

0.05696  0.04864 


Comparison  o-f  RMS  errors  on  'HUMPS  &  DIPS'  500  points 
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Figure  1:  A  Dirichlet  Tesselation  with  5  Tiles.  It  is 
constructed  by  connecting  the  perpendicular  bisectors  o-f  the 
lines  joining  each  o-f  the  knot  points. 
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